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Abstract. We undertake an extensive numerical investigation of the 
graph spectra of thousands regular graphs, a set of random Erdos-Renyi 
graphs, the two most popular types of complex networks and an evolving 
genetic network by using novel conceptual and experimental tools. Our 
objective in so doing is to contribute to an understanding of the meaning 
of the Eigenvalues of a graph relative to its topological and information- 
theoretic properties. We introduce a technique for identifying the most 
informative Eigenvalues of evolving networks by comparing graph spec¬ 
tra behavior to their algorithmic complexity. We suggest that extending 
techniques can be used to further investigate the behavior of evolving 
biological networks. In the extended version of this paper we apply these 
techniques to seven tissue specific regulatory networks as static example 
and network of a naive pluripotent immune cell in the process of differ¬ 
entiating towards a Thl7 cell as evolving example, finding the most and 
least informative Eigenvalues at every stage. 

Keywords: network science; graph spectra behavior; algorithmic proba¬ 
bility; information content; algorithmic complexity; Eigenvalues meaning 


1 Background 

The analysis of large networks raises in many of research fields, the ubiquity of 
large networks makes the analysis of the common properties of these networks 
important.In the most simplistic way can be seen or analyzed as a collection of 
vertices and edges but there are a very different way of representing the graph, 
using the eigenvalues and eigenvectors of matrices associated with the graph 
(Graph Spectra) rather than the vertices and edges themselves.In this study 
a graph or network G defined by pairs (U(G), E(G ));where V(G) is a set of 
vertices (or nodes) and E(G) represent edges(links). Let A be an n x n real 
matrix. An eigenvector of A is a vector such that Ax = Xx for some real or 
complex number A. A is called the Eigenvalue of A belonging to Eigenvector v. 
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The set of graph Eigenvalues of the adjacency matrix is called the spectrum of 
the graph. Spectral analysis is a widely used for a range of problems. In general, 
assigning meaning to Eigenvalues is very difficult. They are very context sensitive 
(i.e. relative to the graph type) and they are cryptic in the sense that they store 
many properties of a graph in a single number that does not lend itself to being 
easily used to reconstruct the properties it encodes. However, they are known 
to encode algebraic and topological information relating to a graph in various 
ways. In this paper we contribute toward the investigation of the interpret ability 
of Eigenvalues, specifically with a general method to determine the type and the 
amount of information about a network that each Eigenvalue carries. We analyse 
growing networks ranging from complete graphs to complex random network and 
demonstrate the distinct behaviour of the eigenvalue spectra of different topology 
class. We will show the unique spectral properties of the major random graph 
models, Erdos-Renyi WL small-world US] and scale free [I]. 

2 Methodology 

All graphs in this paper are undirected, so that the matrices are symmetrical 
and the Eigenvalues are real. They also have no loops, so the matrices have a 
zero diagonal and hence a zero trace, so that the Eigenvalues add up to zero. 
We are interested in investigating the behavior of Spec(G ) relative to the Kol¬ 
mogorov complexity K(G). Formally, the Kolmogorov complexity of a string s is 
K(s) = min{|p| : U(p) m s}. That is, the length (in bits) of the shortest program 
p that when running on a universal Turing machine U outputs s upon halting. 
A universal Turing machine U is an abstraction of a general-purpose computer 
that can be programmed to reproduce any computable object, such as a string 
or a network (e.g. the elements of an adjacency matrix). By the Invariance the¬ 
orem Ku only depends on U up to a constant, so as is conventional, the U 
subscript can be dropped. Formally, Ely such that | Kjj(s) — Kjjr(s)\ < 7 where 7 
is a constant independent of U and Uf. Due to its great power, K comes with a 
technical inconvenience (called semi-computability) and it has been proven that 
no effective algorithm exists which takes a string s as input and produces the 
exact integer K(s) as output |8I3| . Despite the inconvenience K can be effec¬ 
tively approximated by using, for example, compression algorithms. Kolmogorov 
complexity can alternatively be understood in terms of uncompressibility. If an 
object, such as a biological network, is highly compressible, then K is small and 
the object is said to be non-random. However, if the object is uncompressible 
then it is considered algorithmically random. 


Algorithmic probability There is another seminal concept in the theory of 
algorithmic information, namely the concept of algorithmic probability mm and 
its related Universal distribution , also called Levin’s probability semi-measure [9]. 
The algorithmic probability of a string s provides the probability that a valid ran¬ 
dom program p written in bits uniformly distributed produces the string s when 
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run on a universal (prefix- fre^) Turing machine U. In equation form this can be 
rendered as m(s) = Yl^u{p)= s 1/2^. That is, the sum over all the programs p 
for which U outputs s and halts. The algorithmic Coding Theorem [2] establishes 
the connection between m(s) and K(s) as | — log 2 m(s) — K(s)\ < 0(1) (Eq. 1), 
where 0(1) is an additive value independent of s. The Coding Theorem im¬ 
plies that m one can estimate the Kolmogorov complexity of a string from its 
frequency by rewriting Eq. (1) as K rn (s) = — log 2 m(s) + 0(1) (Eq. 2). 


Kolmogorov complexity of Unlabeled graphs As shown in B2I , estima¬ 
tions of Kolmogorov complexity may be arrived at by means of the algorith¬ 
mic Coding theorem, using a 2-dimensional lattice as tape for a 2-dimensional 
deterministic universal Turing machine. Hence m(G) is the probability that a 
random computer program acting on a 2-dimensional grid prints out the ad¬ 
jacency matrix of G. Essentially it uses the fact that the more frequently an 
adjacency matrix is produced, the lower its Kolmogorov complexity and vice 
versa. We call this the Block Decomposition Method (BDM) as it requires the 
partition of the adjacency matrix of a graph into smaller matrices using which 
we can numerically calculate its algorithmic probability by running a large set 
of small 2-dimensional deterministic Turing machines, and thence, by applying 
the algorithmic Coding theorem, its Kolmogorov complexity. Then the overall 
complexity of the original adjacency matrix is the sum of the complexity of its 
parts, albeit with a logarithmic penalization for repetitions, given that n rep¬ 
etitions of the same object only adds logn to its overall complexity. Formally, 
the Kolmogorov complexity of a labeled graph G by means of BDM is defined 
as K B dm(G, d) = T,(r u ,n u )eA(G) dxd lo & 2 <™«) + Km(r u ), where K m (r u ) is the 
approximation of the Kolmogorov complexity of the subarrays r u by using the 
algorithmic Coding theorem (Eq. (2)), and A(G)dxd represents the set with ele¬ 
ments (r n , n u ) obtained when decomposing the adjacency matrix of G into non¬ 
overlapping squares of size d by d. In each (r n , n u ) pair, r u is one such square and 
n u its multiplicity (number of occurrences). From now on KBDAd(g,d = 4) will 
be denoted only by K(G) but it should be taken as an approximation to K(G) 
unless otherwise stated (e.g. when taking the theoretical true K(G) value). More 
details of these measures and their application are given in D3- The Kolmogorov 
complexity of a graph G is thus given by: 

K'(G) = mm{K(A(G l))\G L G L(G)} 

where L(G) is the group of all possible labelings of G and Gl a particular label¬ 
ing. In fact K(G) provides a choice for graph canonization, taking the adjacency 
matrix of G with lowest Kolmogorov complexity. Unfortunately, there is almost 
certainly no simple-to-calculate universal graph invariant, whether based on the 
graph spectrum or any other parameters of a graph. In nu, however, we proved 


1 The group of valid programs forms a prefix-free set (no element is a prefix of any 
other, a property necessary to keep 0 < m(s) < 1.) For details see |4I2] . 
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Fig. 1 . Box distributions of Eigenvalues for growing regular graphs normalized by edge 
count. We call these “spectra signatures”. Top left: Spectra signature of a growing com¬ 
plete graph showing that the Eigenvalues normalized by edge count do not carry any 
extra information than may be found in a simple Kolmogorov complexity graph. Top 
right: Spectra signature of a growing cycle graph showing a wider range of different 
Eigenvalues centered around x — 0. The next spectra signatures have an increasing 
number of different Eigenvalues but remain relatively simple given the regular struc¬ 
ture of the graphs they represent. The diversity of Eigenvalues can be captured by 
classical Shannon entropy, but the non-trivial structure can only be captured by algo¬ 
rithmic complexity. Middle left: Spectra signature of a growing wheel graph. Middle 
right: Spectra signature of a growing fan graph. Bottom left: Spectra signature of a 
growing lattice graph. Bottom right: Spectra signature of a growing path graph. Ob¬ 
vious similarities between similar graphs can be recognized: cycles and wheels have 
similar patterns, grids and paths share some similarities too. However, star and fan 
graphs have spectra that show a greater degree of disparity than the spectra of the 
others. 
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that the calculation of the complexity of any labeled graph is a good approxi¬ 
mation to its unlabeled version. 

3 Results 

3.1 Most informative Eigenvalues 
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Fig. 2. Correlation plots of graph complexity vs largest Eigenvalues. On the X-axis are 
graphs (blue/darker curve) sorted by their algorithmic complexity (from lower to higher 
information content) normalized by graph edge count. On the E-axis are the largest 
Eigenvalues for each graph (yellow/lighter curve). Both complexity and Eigenvalues 
are normalized by graph edge count as we are interested in structural information 
contained in both measures beyond information about the graph size. 


It is clear that Eigenvalues carry different information and therefore can be 
of differential informative value. For example, take a complete graph of size 
n. To reconstruct it from its graph spectra it is enough to look at its largest 
Eigenvalue Ai, simply because it indicates the size of the complete graph and 
therefore contains all the information about it- assuming that we know it is a 
complete graph. If we did not know it to be a complete graph then we would 
need to take into account the rest of the n Eigenvalues, but none of them on its 
own would suffice. That is only if a graph with Ai ^ 0 and A* = — 1 with i = 2 
to n uniquely determines a complete graph. 

In Figs. iHHH and [5| a sample of 4913 graphs distributed in 204 classes 
dividing (with possible repetition) the networks into bins of shared topological 
or algebraic properties, such as being a Moore, Haar, Cayley, tree or acyclic 
graph, display various (mostly significant) degrees of negative and positive cor¬ 
relation with one or more Eigenvalues. The number of graphs come from the 
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Fig. 3. Correlation plots of graph complexity vs second largest Eigenvalues. The second 
largest Eigenvalue displays a larger variety of correlations among graph classes, and 
depicted here is a case where it is found that the second value does not carry any 
information about Crown graphs. For Moore graphs the positive correlation is weak, 
and for Haar graphs it is null but noisy, unlike for Crowns. Specific statistics are given 
in Fig. [5] quantifying the correlations across all graph classes. 
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Fig. 4. Correlation plots of graph complexity vs smallest Eigenvalues. Smallest Eigen¬ 
values tend to be positively correlated to graph information content. Depicted here is 
again a known example of a non-informative Eigenvalue for complete graphs, which is 
nonetheless informative in the sense that deleting the effect of size from its information 
content retrieves almost no information, hence all Eigenvalues and the complexity of 
the graph are basically flat (notice T-axis scale). In another example, unlike the second 
largest Eigenvalue, it can be seen that the smallest Eigenvalue does carry information 
about Crown graphs. 
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Fig. 5. Statistics (p) and p-value plots between graph complexity and largest, second 
largest and smallest Eigenvalues of 204 different graph classes including 4913 graphs. 
Clearly the graph class complexity correlates in different ways to different Eigenvalues 
but in most cases this correlation is strong and there is a clear tendency of the largest 
Eigenvalue to be negatively correlated to information content, then a quick transition 
at the second largest and finally a clear positive correlation with the smallest. 
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graphs available in the Mathematica v.10 software built-in repository function 
GraphData[]. The most commonly found case was a negative correlation between 
largest Eigenvalues and graph information content. However, positive correla¬ 
tion and non-trivial differences between next largest and smallest Eigenvalues 
were found and their behavior is highly graph-topology dependent. This suggests 
that while the largest Eigenvalue encodes important structural information of 
the graph, all Eigenvalues may carry some information, with some being more 
or less informative than others. The complete graph is a trivial example of no 
correlation, where it is clear that the Eigenvalue is not providing any structural 
information about the graph other than its size, which is erased when normalized 
by edge count as it is in these plots, hence discounting by any edge count con¬ 
tribution. The degree and type of correlation can be found in Fig. [5j quantified 
by a typical Pearson correlation test. 

If the Eigenvalue behavior of a graph G is flat, then its informat ion-content 
is low or null, except perhaps because of the multiplicity of the value and the 
total number of occurrences of the same value, trivially indicating, for example, 
the size of the network, given that the number of Eigenvalues is equal to the 
number of vertices of G. This also means that Eigenvalues with flat behavior are 
less informative, a fact which enables clear discrimination between interesting 
and uninteresting Eigenvalues, beyond a simple consideration of numerical value 
(numerical values can be different and still not carry any information about a 
graph). 


3.2 Graph spectra behavior of evolving networks 
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Fig. 6. Eigenvalues behavior. The largest Eigenvalue in a random E-R graph of size 
100 vertices for edge density from 0 to 1 (X-axis) is the only one behaving differently 
from the rest. Some properties of the largest Eigenvalue are known, such as being an 
indicator of number of bifurcations, so the greater the edge count the greater its value. 
However, the next Eigenvalues all manifest a common behavior, reaching a maximum 
and describing a concave curve. 
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Fig. 7. Spectra signature of a random E-R graph of size 100 for edge density 0 to 1. 
Clearly for edge density 1, the random graph spectra are simply those of a complete 
graph. 



Fig. 8. Spectra signature of a Watts-Strogatz growing into a 100-node network with 
rewiring probability 0.05. 
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Fig. 9. Spectra signature of a growing Barabasi-Albert network reaching a size of 100 
nodes where a new vertex with 4 edges is added at each step. 


3.3 Spectra signatures 

We compared the spectra signature of an evolving graph to the Box plots of the 
Eigenvalues of the graph over time. Fig. [l] for example, shows the asymptotic 
behavior of each Eigenvalue for well known regular graphs and how the plots 
characterize them with various regular patterns, including cyclic behavior for a 
cycle graph. They also show how the accumulation of Eigenvalues is distributed 
differently for different graphs, with their rate of growth depending on the graph 
type. A complete graph G of size n = |E(G)|, for example, has graph spectra 
(n — l) 1 , (—l) n_1 with its values corresponding to the plot in Fig. [ljleft). When 
the number of different Eigenvalues is small (i.e. their multiplicity is too high) 
and they converge soon to a fixed normalized Eigenvalue, this is an indication 
that the Eigenvalue carries no information or is exhausted after a few evolving 
steps (i.e. no more information can be extracted, or the graph can be character¬ 
ized after a few evolving steps) (see spectra signatures in Fig. [T]) . We undertook 
a novel numerical investigation of the Eigenvalues of growing graphs for differ¬ 
ent classes, shedding light on both known and possibly unexplored properties 
of Eigenvalues for some specific graph types. To this end we calculated what 
we defined as spectra signatures of random and complex networks prior to a 
deeper investigation concerning the information content of synthetic graphs and 
biological networks. 


4 Conclusions 

We have introduced a concept of spectra signatures based upon numerical calcu¬ 
lations of growing networks with different group-theoretic and topological prop¬ 
erties for the study of evolving network behavior. We have then moved toward 
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the information content of these networks via estimating their Kolmogorov com¬ 
plexity by means of entropy, lossless compression and algorithmic probability 
(BDM). 

We have introduced an analysis based on correlation comparisons of each 
Eigenvalue against the information content of a graph to reveal the most informa¬ 
tive Eigenvalue for different graph classes. We found that the largest Eigenvalues 
are negatively correlated to graph complexity even after edge count normaliza¬ 
tion, while the smallest Eigenvalues are in general not correlated or positively 
correlated, with only a couple of cases of negative correlation. While most re¬ 
search has focused on a few of the largest Eigenvalues of a graph spectrum, 
we have shown that in actual fact the smallest Eigenvalues carry a high infor¬ 
mation content as often as the largest. The techniques introduced here can be 
extended to Laplacian matrices, but Laplacian matrices carry only redundant 
information about the degree of the vertices because the original graph can be 
reconstructed from the adjacency matrix alone. Thus the effect of Spec(G ) on 
the Laplacian or simple spectra of G with respect to K(G ) is negligible. For 
Kolmogorov complexity, we have | K(Al(G)) — K(A(G ))| < c, where Al(G ) is 
the Laplacian matrix of G, A(G) is the simple adjacency matrix of G and c is 
the algorithm implementing the Laplacian calculation L = D(G) — A(G ), where 
D(G ) is the diagonal degree matrix of G. We believe this is a novel approach to 
extracting meaning from and thus contributing to the solution of the problem 
of the interpretability of graph spectra, a fundamental step toward applications 
of graph spectra theory in network biology, especially in the context of evolving 
networks-given that some biological models are represented as Ordinary Differ¬ 
ential Equations for which this approach, when applied to the Jacobian matrices 
of the ODEs, is thoroughly relevant. As introduced here, this approach promises 
to be able to reveal specifics about the behavior of a biological network over time 
through the study of Eigenvalues in relation to their information-content. 

One future research direction is the investigation of behavioral differences in 
Eigenvalues of networks representing disease as compared to those of healthy 
networks, both as profiling techniques and as a tool for understanding the di¬ 
rection in which a healthy network may over time progress towards a disease 
state. 
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